- 
            Ergodic search enables optimal exploration of an information distribution with guaranteed asymptotic coverage of the search space. However, current methods typically have exponential computational complexity and are limited to Euclidean space. We introduce a computationally efficient ergodic search method. Our contributions are twofold. First, we develop a kernel-based ergodic metric and generalize it from Euclidean space to Lie groups. We prove that this metric is consistent with the exact ergodic metric and ensures linear complexity. Second, we derive an iterative optimal control algorithm for trajectory optimization with the kernel metric. Numerical benchmarks show our method is two orders of magnitude faster than the state-of-the-art method. Finally, we demonstrate the proposed algorithm on a peg-in-hole insertion task. We formulate the problem as a coverage task in SE(3) and use a 30-second human demonstration as the prior distribution for ergodic coverage. Ergodicity guarantees the asymptotic solution of the peg-in-hole problem so long as the solution resides within the prior information distribution, which is reflected in the 100% success rate.
            Free, publicly-accessible full text available January 1, 2026
- 
            Free, publicly-accessible full text available January 1, 2026
- 
            We consider the problem of distributed pose graph optimization (PGO), which has important applications in multi-robot simultaneous localization and mapping (SLAM). We propose the majorization minimization (MM) method for distributed PGO (MM-PGO), which applies to a broad class of robust loss kernels. The MM-PGO method is guaranteed to converge to first-order critical points under mild conditions. Furthermore, noting that the MM-PGO method is reminiscent of proximal methods, we leverage Nesterov's method and adopt adaptive restarts to accelerate convergence. The resulting accelerated MM methods for distributed PGO, both with a master node in the network (AMM-PGO*) and without (AMM-PGO#), converge faster than the MM-PGO method without sacrificing theoretical guarantees. In particular, the AMM-PGO# method, which needs no master node and is fully decentralized, features a novel adaptive restart scheme and has a rate of convergence comparable to that of the AMM-PGO* method, which uses a master node to aggregate information from all the nodes. The efficacy of this work is validated through extensive applications to 2D and 3D SLAM benchmark datasets and comprehensive comparisons against existing state-of-the-art methods, indicating that our MM methods converge faster and result in better solutions to distributed PGO. The code is available at https://github.com/MurpheyLab/DPGO.
- 
            Early research on physical human–robot interaction (pHRI) has necessarily focused on device design: the creation of compliant and sensorized hardware, such as exoskeletons, prostheses, and robot arms, that enables people to safely come in contact with robotic systems and to communicate about their collaborative intent. As hardware capabilities have become sufficient for many applications, and as computing has become more powerful, algorithms that support fluent and expressive use of pHRI systems have begun to play a prominent role in determining the systems' usefulness. In this review, we describe a selection of representative algorithmic approaches that regulate and interpret pHRI, describing the progression from algorithms based on physical analogies, such as admittance control, to computational methods based on higher-level reasoning, which take advantage of multimodal communication channels. Existing algorithmic approaches largely enable task-specific pHRI, but they do not generalize to versatile human–robot collaboration. Throughout the review and in our discussion of next steps, we therefore argue that emergent embodied dialogue (bidirectional, multimodal communication that can be learned through continuous interaction) is one of the next frontiers of pHRI.
- 
            This paper proposes a novel approach that enables a robot to learn an objective function incrementally from human directional corrections. Existing methods learn from human magnitude corrections; since a human needs to carefully choose the magnitude of each correction, those methods can easily lead to over-corrections and learning inefficiency. The proposed method only requires human directional corrections, that is, corrections that indicate only the direction of an input change without indicating its magnitude. We only assume that each correction, regardless of its magnitude, points in a direction that improves the robot's current motion relative to an unknown objective function. The allowable corrections satisfying this assumption account for half of the input space, as opposed to magnitude corrections, which have to lie in a shrinking level set. For each directional correction, the proposed method updates the estimate of the objective function based on a cutting plane method, which has a geometric interpretation. We have established theoretical results to show the convergence of the learning process. The proposed method has been tested in numerical examples, a user study on two human-robot games, and a real-world quadrotor experiment. The results confirm the convergence of the proposed method and further show that the method is significantly more effective (higher success rate), efficient/effortless (fewer human corrections needed), and potentially more accessible (fewer early wasted trials) than the state-of-the-art robot learning frameworks.
- 
            This paper develops the method of Continuous Pontryagin Differentiable Programming (Continuous PDP), which enables a robot to learn an objective function from a few sparsely demonstrated keyframes. The keyframes, labeled with time stamps, are the desired task-space outputs, which a robot is expected to follow sequentially. The time stamps of the keyframes can differ from the time of the robot's actual execution. The method jointly finds an objective function and a time-warping function such that the robot's resulting trajectory sequentially follows the keyframes with minimal discrepancy loss. Continuous PDP minimizes the discrepancy loss using projected gradient descent, by efficiently solving for the gradient of the robot trajectory with respect to the unknown parameters. The method is first evaluated on a simulated robot arm and then applied to a 6-DoF quadrotor to learn an objective function for motion planning in unmodeled environments. The results show the efficiency of the method, its ability to handle time misalignment between keyframes and robot execution, and the generalization of objective learning to unseen motion conditions.
- 
            Intelligence involves processing sensory experiences into representations useful for prediction. Understanding sensory experiences and building these contextual representations without prior knowledge of sensor models and environment is a challenging unsupervised learning problem. Current machine learning methods process new sensory data using prior knowledge defined by either domain knowledge or datasets. When datasets are not available, data acquisition is needed, though automating exploration in support of learning is still an unsolved problem. Here we develop a method that enables agents to efficiently collect data for learning a predictive sensor model, without requiring domain knowledge, human input, or previously existing data, using ergodicity to specify the data acquisition process. This approach is based entirely on data-driven sensor characteristics rather than predefined knowledge of the sensor model and its physical characteristics. We learn higher quality models with lower energy expenditure during exploration for data acquisition compared to competing approaches, including both random sampling and information maximization. In addition to applications in autonomy, our approach provides a potential model of how animals use their motor control to develop high quality models of their sensors (sight, sound, touch) before having knowledge of their sensor capabilities or their surrounding environment.
- 
            Enabling efficient communication in artificial agents brings us closer to machines that can cooperate with each other and with human partners. Hand-engineered approaches have substantial limitations, leading to increased interest in methods for communication to emerge autonomously between artificial agents. Most of the research in the field explores unsituated communication in one-step referential tasks. The tasks are not temporally interactive and lack time pressures typically present in natural communication and language learning. In these settings, agents can successfully learn what to communicate but not when or whether to communicate. Here, we extend the literature by assessing emergence of communication between reinforcement learning agents in a temporally interactive, cooperative task of navigating a gridworld environment. We show that, through multi-step interactions, agents develop just-in-time messaging protocols that enable them to successfully solve the task. With memory, which provides flexibility around message timing, agent pairs converge to a look-ahead communication protocol, finding an optimal solution to the task more quickly than without memory. Lastly, we explore situated communication, enabling the acting agent to choose when and whether to communicate. With the opportunity cost of forgoing an action to communicate, the acting agent learns to solicit information sparingly, in line with the Gricean Maxim of quantity. Our results point towards the importance of studying language emergence through situated communication in multi-step interactions.
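
As a small illustration of the majorization minimization idea mentioned in the distributed PGO abstract above, the sketch below applies MM to a much simpler problem: estimating a scalar location under a Huber robust loss. This is not the authors' method or code. The function names (`huber`, `mm_step`, `mm_huber_location`), the choice of the Huber kernel, and the toy data are all assumptions made for illustration. At each iteration the robust loss is replaced by a quadratic upper bound that touches it at the current estimate, and minimizing that bound reduces to a weighted mean.

```python
def huber(r, delta):
    """Huber loss: quadratic near zero, linear in the tails."""
    a = abs(r)
    return 0.5 * r * r if a <= delta else delta * (a - 0.5 * delta)


def objective(x, data, delta):
    """Sum of Huber losses of the residuals x - a_i."""
    return sum(huber(x - a, delta) for a in data)


def mm_step(x, data, delta):
    """One MM step: minimize the quadratic majorizer built at x.

    The majorizer of the Huber loss at residual r is (w / 2) * r'^2 + const
    with w = min(1, delta / |r|), so its minimizer is a weighted mean.
    """
    weights = [1.0 if abs(x - a) <= delta else delta / abs(x - a) for a in data]
    return sum(w * a for w, a in zip(weights, data)) / sum(weights)


def mm_huber_location(data, delta=1.0, iters=50, tol=1e-10):
    """Run MM from the sample mean; return the estimate and objective history."""
    x = sum(data) / len(data)
    history = [objective(x, data, delta)]
    for _ in range(iters):
        x_new = mm_step(x, data, delta)
        history.append(objective(x_new, data, delta))
        converged = abs(x_new - x) < tol
        x = x_new
        if converged:
            break
    return x, history


if __name__ == "__main__":
    # Hypothetical measurements: four inliers near 1.0 and one gross outlier.
    samples = [0.9, 1.0, 1.1, 1.0, 25.0]
    estimate, costs = mm_huber_location(samples, delta=1.0)
    print(estimate, costs[0], costs[-1])
```

Because each step minimizes an upper bound that is tight at the current estimate, the objective never increases from one iteration to the next. On the toy data the estimate moves from the sample mean (5.8) toward roughly 1.25, where the clipped pull of the outlier balances the inliers.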